@kmruiz (Collaborator) commented Oct 8, 2025

Proposed changes

This PR adds a new session-level service called VectorSearchEmbeddings that is responsible for:

  1. Determining whether Atlas Search is available.
  2. Retrieving which fields of a collection are embeddings, based on Atlas Search index definitions.
  3. Validating that a given document conforms to those embedding definitions.
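
The second and third responsibilities can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `SearchIndex` shape is a simplified assumption loosely modeled on vector search index definitions (a `fields` array with `type`, `path` and `numDimensions`), and the helper names `embeddingFields`, `getByPath` and `findInvalidEmbeddings` are hypothetical.

```typescript
// Hypothetical, simplified shapes; real index documents carry more fields.
interface VectorField {
  type: "vector";
  path: string; // dotted path of the embedding field
  numDimensions: number; // expected vector length
}

interface FilterField {
  type: "filter";
  path: string;
}

interface SearchIndex {
  name: string;
  type: "search" | "vectorSearch";
  latestDefinition: { fields?: Array<VectorField | FilterField> };
}

// Collect every field declared as a vector in a vectorSearch index.
function embeddingFields(indexes: SearchIndex[]): VectorField[] {
  const result: VectorField[] = [];
  for (const index of indexes) {
    if (index.type !== "vectorSearch") continue;
    for (const field of index.latestDefinition.fields ?? []) {
      if (field.type === "vector") result.push(field);
    }
  }
  return result;
}

// Resolve a dotted path such as "plot.embedding" inside a document.
function getByPath(doc: Record<string, unknown>, path: string): unknown {
  let current: unknown = doc;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || current === null) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Return one message per embedding field whose value does not match its definition.
function findInvalidEmbeddings(
  doc: Record<string, unknown>,
  fields: VectorField[]
): string[] {
  const problems: string[] = [];
  for (const field of fields) {
    const value = getByPath(doc, field.path);
    if (value === undefined) continue; // assumption: an absent field is not an error
    if (!Array.isArray(value) || !value.every((v) => typeof v === "number")) {
      problems.push(`${field.path}: expected an array of numbers`);
    } else if (value.length !== field.numDimensions) {
      problems.push(
        `${field.path}: expected ${field.numDimensions} dimensions, got ${value.length}`
      );
    }
  }
  return problems;
}
```

In this sketch a document that simply lacks the embedding field passes validation; whether the real service treats that case the same way is not stated in the description.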

Because embedding combinations and their detection can be inaccurate, we also provide a new configuration option, "disableEmbeddingsValidation", that can be set via the CLI or an environment variable. When it is true, the validation is bypassed.
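
As a rough sketch of how such a switch might gate validation: only the option name `disableEmbeddingsValidation` comes from this PR, while the `EmbeddingsConfig` type, the `EmbeddingValidator` type and the `validateUnlessDisabled` helper are hypothetical.

```typescript
// Hypothetical config slice; only the option name is taken from the PR description.
interface EmbeddingsConfig {
  disableEmbeddingsValidation: boolean;
}

// A validator returns a list of problems; an empty list means the document is valid.
type EmbeddingValidator = (doc: Record<string, unknown>) => string[];

// Run the validator unless the user opted out, in which case every document passes.
function validateUnlessDisabled(
  config: EmbeddingsConfig,
  doc: Record<string, unknown>,
  validate: EmbeddingValidator
): string[] {
  if (config.disableEmbeddingsValidation) {
    return [];
  }
  return validate(doc);
}
```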

This PR also introduces embedding validation in the insertMany tool, so users cannot unknowingly insert data that could break existing models or indexes.
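
One way such a guard could look, as a sketch under assumptions: the whole batch is checked before any write, and a single invalid document rejects the insert. The `collectBatchErrors` and `guardedInsertMany` helpers and their callback signatures are hypothetical and are not the tool's actual implementation.

```typescript
type Doc = Record<string, unknown>;
type Validator = (doc: Doc) => string[];

// Validate every document up front and report problems by position in the batch.
function collectBatchErrors(docs: Doc[], validate: Validator): string[] {
  const errors: string[] = [];
  docs.forEach((doc, index) => {
    for (const problem of validate(doc)) {
      errors.push(`document ${index}: ${problem}`);
    }
  });
  return errors;
}

// Only call the injected insert function when the whole batch is valid.
async function guardedInsertMany(
  docs: Doc[],
  validate: Validator,
  insert: (docs: Doc[]) => Promise<number>
): Promise<number> {
  const errors = collectBatchErrors(docs, validate);
  if (errors.length > 0) {
    throw new Error(`Refusing to insert:\n${errors.join("\n")}`);
  }
  return insert(docs);
}
```

Injecting `insert` as a callback keeps the sketch independent of any driver; in practice it would wrap the collection's insert call.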

This PR depends on #628 being merged, as it implements a method to detect whether Atlas Search is available. Once that PR is merged, I'll refactor the method introduced there to use VectorSearchEmbeddings, so that search detection lives in a single place.

Checklist

@kmruiz self-assigned this Oct 8, 2025
@kmruiz marked this pull request as ready for review October 9, 2025 16:01
@kmruiz requested a review from a team as a code owner October 9, 2025 16:01
Copilot AI review requested due to automatic review settings October 9, 2025 16:01
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@kmruiz marked this pull request as draft October 9, 2025 16:10
@kmruiz requested a review from Copilot October 13, 2025 16:35
@kmruiz marked this pull request as ready for review October 13, 2025 16:36
Copilot AI (Contributor) left a comment

Pull Request Overview

Copilot reviewed 23 out of 23 changed files in this pull request and generated 2 comments.

@kmruiz requested a review from nirinchev October 16, 2025 08:35
@coveralls (Collaborator) commented Oct 16, 2025

Pull Request Test Coverage Report for Build 18556878086

Details

  • 210 of 227 (92.51%) changed or added relevant lines in 15 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.2%) to 82.747%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| --- | --- | --- | --- |
| src/common/search/vectorSearchEmbeddingsManager.ts | 119 | 126 | 94.44% |
| src/tools/mongodb/mongodbTool.ts | 20 | 30 | 66.67% |

Totals
Change from base Build 18552067365: 0.2%
Covered Lines: 5950
Relevant Lines: 7059

💛 - Coveralls

@himanshusinghs (Collaborator) left a comment

Looks good

@kmruiz merged commit 494070f into main Oct 16, 2025
15 of 17 checks passed
@kmruiz deleted the chore/mcp-246 branch October 16, 2025 10:51
nirinchev added a commit that referenced this pull request Oct 16, 2025